ABSTRACT


This  report  reviews  various  mathematical  models  of  use 
in  relating  ship  readiness  and  system  reliability  to  repair 
capability.   It  includes  a  model  that  describes  the  possible 
effects  of  preventive  maintenance  upon  readiness.   Certain 
parameters  in  the  models  are  influenced  by  the  quality  of 
personnel  available. 


Prepared by:


ANALYTICAL  MODELS  FOR  SUPPLEMENTING  SHIP  MANNING  SIMULATIONS 

Donald P. Gaver
Naval  Postgraduate  School 

1.   Introduction. 

Complex  computer  simulations,  such  as  that  dubbed  SHIP  II  by 
the  Naval  Personnel  Research  and  Development  Lab,  have  been  developed 
for  the  purpose  of  studying  the  effectiveness  of  manpower  on  board  a 
ship.   These  simulations  attempt  to  adhere  closely  to  Naval  manning 
doctrine,  and  are  elaborate  and  detailed,  require  extensive  data 
inputs,  and  are  expensive  to  run. 

I  suggest  that  such  complex  models  are  often  usefully  paralleled 
and  supplemented  by  a  family  of  simple  analytical  or  mathematical 
models,  i.e.  mathematical  structures  that,  while  simplified  and 
abstract,  can  be  manipulated  by  mathematical  techniques.   While  such 
models  may  appear  oversimplified  or  naive,  their  true  function  is  to 
provide  a  broad-brush  picture  of  an  admittedly  complicated  situation 
and  to  help  explain  qualitative  phenomena  revealed  by  simulation  runs. 
The existence of a mathematical or probabilistic model that is susceptible
to mathematical or simple numerical manipulation also provides
one way of checking an elaborate (and accident-prone) simulation
program. In addition, such a model may serve well as a variance reduction
agent; see the account of the "control variables" technique in
Gaver [2], and its application in Gaver and Shedler [3].

In  this  paper  I  will  outline  and  explore  a  selection  of  the  kinds 
of  mathematical  models  that  seem  especially  relevant  in  the  SHIP  II 
context. Further elaborations will be deferred for study in later reports.


2.   The "Repairman" Model Type.

The  SHIP  II  model's  impact  for  manpower  planning  stems  from 
its  capacity  to  relate  ship  readiness  to  manning.   That  is,  certain 
tasks  must  routinely  be  required  of  the  crew  (evolutions,  watch 
standing,  etc.),  while  others  arise  in  an  unpredictable  or  random 
manner,  as  is  true  of  corrective  maintenance  that  is  called  for  when 
crucial equipments or systems fail. If such failures happen close
together in time, and few or inexperienced repairmen are aboard ship,
a repair backlog builds up and failed equipments must wait for repair
to begin. We wish to analyze
the  relationship  between  the  supply  of  repairmen  and  the  readiness  or 
availability  of  key  equipments  aboard  ship. 

Classical models or model types that relate to the above question
are the "repairman" models; see Feller [1] and Morse [4]. We
outline the usual model assumptions and their most transparent consequences.
Then we discuss realistic modifications.
(2.1)   Simplest  Assumptions. 

Suppose that $m$ ($m \ge 1$) failure-prone equipments are maintained
by a group of repairmen. Classically, individual repairmen were
authorized to carry out service, but in the real context teams of
various constitutions and sizes are required. For the moment suppose
that $r$ ($r \ge 1$) teams are available. Let all equipments have the same
basic failure rate, $\lambda$, and let $\mu$ denote the repair rate (so $1/\mu$
is the mean or expected repair time). A single instance of repair time,
$R$, is an exponentially distributed random variable, and the time to
failure of an individual equipment is also exponential. All of these
variables are mutually independent. Already we take note of several
heroic simplifying assumptions, some of which can be relaxed in a
straightforward manner, as we will show. For example, failure times
of different equipments will not have the same exponential distribution,
nor will repair times. In particular, repair times do not now
reflect behavioral factors such as learning and training (neither
does the current version of SHIP II). But if a complex simulation,
say, allows for arbitrary distributions, then when we specialize to
exponentials the simulation should give the same answers as do our
models. Thus our models furnish a means of checking internal validity.
To the extent that the specific distributional assumptions are
irrelevant (for example, results may depend only upon the means of
the distributions), the models furnish formulas that can actually
replace time-consuming, expensive, error-prone simulations. This is
something to consider and exploit.

Finally,  the  models  assume  that  repairs  are  conducted  in  the 
arrival  order  of  the  failures,  without  priority  assignments.   And  only 
the  long-run  or  steady  state  solutions  are  found. 
(2.2)   Model  Questions  and  Answers. 

Let $P_n(t)$ denote the probability that exactly $n$ equipments
are "down" for repair at time $t$ after, say, a mission or tour commences.
Presumably then $P_n(0) = 0$ for $n = 1, 2, \ldots, m$, while
$P_0(0) = 1$ if the shore repair facilities are successfully preparing
the vessel for its tour.


Now Feller ([1], Chap. XVII, Section 7) describes the way in
which the latter problem may be formulated as a birth and death process.
I will not repeat the mathematical arguments here. In summary, the
results, and their usefulness in the present context, are as follows.

(a)  Differential equations are derived for $P_n(t)$ in terms of
$\lambda$, $\mu$, $m$, and $r$. These equations are capable of explicit solution
in terms of exponentials, but the solutions are complicated.
Of more use are numerical solutions, presented graphically.
For example, one can tabulate the average (mean, or expected)
number of equipments down (unready) and undergoing or awaiting
repair at a particular time during a mission, as it
depends upon manning level, $r$.

(b)  It is pointed out that as $t$, the time that elapses following
mission start, becomes large ($t \to \infty$), $P_n(t)$ approaches
a limit, $p_n$. This limiting, or steady-state, probability
distribution can be found nearly explicitly. Actually it is
easiest to write down the "probability balance equations"

$$(n+1)\,\mu\, p_{n+1} = (m-n)\,\lambda\, p_n, \qquad n < r,$$

$$r\,\mu\, p_{n+1} = (m-n)\,\lambda\, p_n, \qquad n \ge r, \tag{2.1}$$

and solve them numerically on a computer. That is, we start with
$p_n = k_n p_0$, $k_0 = 1$, substitute into the above equations,
cancel out $p_0$, and solve successively for $k_1, k_2, \ldots, k_m$. Then,
since

$$\sum_{n=0}^{m} p_n = 1 = p_0 \sum_{n=0}^{m} k_n, \tag{2.2}$$

$p_0$, and hence every $p_n = k_n p_0$, may be determined.
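
The recursion just described can be sketched in a few lines of Python. This is only an illustrative sketch, not part of SHIP II or of Feller's treatment: the function names `repairman_steady_state` and `expected_down` and the parameter values in the demonstration are assumptions chosen for the example.

```python
def repairman_steady_state(m, r, lam, mu):
    """Steady-state probabilities p_0..p_m from (2.1) and (2.2).

    m: number of equipments, r: number of repair teams,
    lam: failure rate per equipment, mu: repair rate per team.
    """
    k = [1.0]  # k_0 = 1
    for n in range(m):
        # Number of teams busy when n + 1 equipments are down.
        busy = n + 1 if n < r else r
        k.append(k[n] * (m - n) * lam / (busy * mu))
    total = sum(k)  # p_0 = 1 / total, by (2.2)
    return [kn / total for kn in k]


def expected_down(p):
    """Mean number of equipments down (under or awaiting repair)."""
    return sum(n * pn for n, pn in enumerate(p))


if __name__ == "__main__":
    # Hypothetical values: 6 equipments, failure rate 0.1, repair rate 1.0.
    for teams in (1, 2, 3):
        p = repairman_steady_state(m=6, r=teams, lam=0.1, mu=1.0)
        print(teams, round(p[0], 4), round(expected_down(p), 4))
```

Running the loop for several values of `r` gives the kind of table of expected number down against manning level that is mentioned in (a), here for the steady state only.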

These "steady state" solutions can be used to calculate an
approximation to the expected number of equipments down (unready) after
the mission has been under way for some time. For a single repairman
($r = 1$), Feller [1] shows that the expected number of machines in line,
waiting to begin processing or repair, is

$$w = m - \left(\frac{\lambda+\mu}{\lambda}\right)(1 - p_0). \tag{2.3}$$

Even if we don't know $p_0$, we do know that it lies between zero and
unity, so very crudely

$$m - \frac{\lambda+\mu}{\lambda} \le w \le m, \tag{2.4}$$

and better approximations can be derived with some effort. Work is
currently underway to build useful mathematical approximations to such
problems based on simplifications that occur when $m$ becomes large.
But much that is useful as a supplement to (or replacement of) simulations
follows from adaptations of the above results.
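
As a small check on (2.3), the sketch below computes the expected waiting line for a single repair team in two ways: directly from the steady-state probabilities, and from the formula. The function name `single_repairman` and the numerical values are assumptions made for illustration only.

```python
def single_repairman(m, lam, mu):
    """Return (p_0, w computed directly, w from formula (2.3)) for r = 1."""
    k = [1.0]
    for n in range(m):
        k.append(k[n] * (m - n) * lam / mu)  # balance equation (2.1) with r = 1
    total = sum(k)
    p = [kn / total for kn in k]
    # With n equipments down and one team, n - 1 of them are waiting.
    w_direct = sum((n - 1) * p[n] for n in range(1, m + 1))
    w_formula = m - (lam + mu) / lam * (1.0 - p[0])
    return p[0], w_direct, w_formula


if __name__ == "__main__":
    # Hypothetical values: 5 equipments, failure rate 0.2, repair rate 1.0.
    p0, w_direct, w_formula = single_repairman(m=5, lam=0.2, mu=1.0)
    print(p0, w_direct, w_formula)
```

The two values of `w` should agree to rounding error, and both should fall inside the crude bounds (2.4).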
(2.3)   Alternative Assumptions and Results; The Infinite Server Approximation.

The  assumptions  in  the  previous  section  can,  with  benefit,  be 
relaxed  in  one  direction,  and  made  more  general  in  another,  if  a 
different  model  is  constructed. 

I will again assume that equipments break down at random, i.e.
the $i$-th equipment has failure rate $\lambda_i$ ($i = 1, 2, 3, \ldots, m$). The
repair time distribution of the $i$-th equipment is $F_i(x)$, where

$$F_i(x) = P\{R_i \le x\} \tag{2.5}$$

is a completely arbitrary distribution, e.g. the log-normal. This
is a more general assumption than was made earlier, for there $R_i$ was
exponential. Furthermore, assume that the repair process is such that
there is no waiting to begin repair because of the presence of other
equipments that failed earlier. This rather strong assumption can perhaps
be justified if equipments do indeed fail infrequently enough, and
repair times are short enough. It will not be justified if there are
only a few repairmen (or teams), as would be the case if there were a
drastic reduction in force, or if the quality of repair service degenerated,
i.e. repair times increased because of inexperienced crews.
Results that may be obtained (we omit details) are as follows.
(a)   The long-run or steady-state probability that a given system,
System $i$, is undergoing repair is

$$P\{\text{System } i \text{ unavailable}\} = \frac{\lambda_i E[R_i]}{1 + \lambda_i E[R_i]}. \tag{2.6}$$

Actually, one can find the probability that System $i$ is, or is
not, available at any time $t$ following mission start with somewhat
more difficulty.

Note. The above formula, (2.6), holds regardless of whether
failure times are exponentially distributed or not. It turns out
that long-run availability depends upon the failure-time distribution
only through the mean or expected time to failure, $\lambda_i^{-1}$, in this
model. Consequently, even if times
to failure have the Weibull, log-normal, or gamma distributions,
our formula holds, and can be used to check out or validate
simulations.
(b)   As a result of (a), the expected number of systems that are unavailable
is

$$E[\#\text{ of Systems unavailable}] = \sum_{i=1}^{m} \frac{\lambda_i E[R_i]}{1 + \lambda_i E[R_i]},$$

and the variance of the number of unavailable equipments is

$$\mathrm{Var}[\#\text{ of Systems unavailable}] = \sum_{i=1}^{m} \frac{\lambda_i E[R_i]}{(1 + \lambda_i E[R_i])^2}.$$

(c)   Furthermore, the probability that all systems are available is

$$P\{\text{all Systems available}\} = \prod_{i=1}^{m} \left(\frac{1}{1 + \lambda_i E[R_i]}\right),$$

because systems are supposed to fail and be repaired entirely
independently. This need not be a good model, but can be used
to check internal validity.
(d)   Finally, if all $\lambda_i E[R_i]$ terms are of about the same magnitude
and are small compared to unity, as should frequently be the
case, then the total number of Systems out
or unavailable will be approximately Poisson distributed.
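
Results (a) through (d) are easy to evaluate numerically. The following is a minimal sketch under the independence assumption stated in (c); the function name `infinite_server_summary` and the example failure rates and mean repair times are hypothetical, not data from any ship.

```python
import math


def infinite_server_summary(failure_rates, mean_repair_times):
    """Evaluate results (a)-(d) of the infinite-server approximation."""
    loads = [lam * er for lam, er in zip(failure_rates, mean_repair_times)]
    unavailable = [a / (1.0 + a) for a in loads]            # formula (2.6)
    mean_down = sum(unavailable)                            # result (b), mean
    var_down = sum(q * (1.0 - q) for q in unavailable)      # result (b), variance
    p_all_up = math.prod(1.0 - q for q in unavailable)      # result (c)
    # Result (d): Poisson approximation to P{no system down}, same mean.
    p_all_up_poisson = math.exp(-mean_down)
    return unavailable, mean_down, var_down, p_all_up, p_all_up_poisson


if __name__ == "__main__":
    # Hypothetical inputs: failures per day and mean repair times in days.
    rates = [0.02, 0.05, 0.01, 0.03]
    repairs = [0.5, 0.25, 1.0, 0.4]
    print(infinite_server_summary(rates, repairs))
```

Note that each term `q * (1 - q)` equals $\lambda_i E[R_i]/(1+\lambda_i E[R_i])^2$, the variance of an indicator variable with success probability (2.6). When all loads are small the exact and Poisson values of the last two outputs should be close.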


Note. The above formulas are simple, explicit, and hence easy
to apply. They should approximate actual system availability
late in a mission if the failure rate is reasonably small; the
restriction to "late in a mission" is necessary because our
results are for the steady state, which requires time to establish
itself. However, our steady-state results (average number of
systems down, for example) will typically be more optimistic than they
should be, because of the assumption that a repair may always
begin immediately. Thus our simple models can be immediately
employed to generate an upper bound on availability or readiness.


3.   More Realistic Repairman Models.

Certain  simplifying  assumptions  made  in  the  models  just  described 
may  be  rectified  with  only  a  little  trouble.   The  changes  we  suggest 
will  make  the  results  more  credible  and  valid,  but,  as  seems  inevitable, 
there  will  be  costs.   The  costs  present  themselves  in  the  form  of 
increased  mathematical  complexity  in  model  formulation  and  manipulation. 
Nevertheless,  these  particular  difficulties  are  quite  surmountable  if 
a  digital  computer  is  available  to  carry  out  computations.   Other  costs 
relate  to  the  development  of  a  better  understanding  of  the  maintenance 
process,  the  gathering  of  relevant  data  on  the  effectiveness  of  training, 
etc.   Such  data  might  come  from  fleet  experiments  and  service  school 
experience.   At  the  moment  our  models  can  only  answer  interesting  "what 
if"  questions,  consuming  judgmental  data. 
(3.1)  Individualizing  the  Repairman  Models. 

The  basic  repairman  models  assume  that  failure  rates  (demand 
for  repairs)  of  equipments  are  equal.   In  fact,  this  is  rather  unlikely. 
Let us assume instead that there are $m$ equipments with failure rates
$\lambda_i$ ($i = 1, 2, \ldots, m$), and one repairman (or crew) that services them
all. This latter restriction may easily be removed, and we make it
only to illustrate in a simple way the new analysis, which is straightforward
but tedious and is not in the standard books. We will assume
that each equipment has its own repair rate (repair times are exponentially
distributed): the repair rate for system $i$ is $\mu_i$ ($i = 1, 2, \ldots, m$).

Because  the  failure  rates  differ  it  is  necessary  to  specify  a 
descriptive  set  of  states  for  the  entire  system.   I  will  first  illustrate 
this for only two systems or machines (e.g. a sonar and a computer),
leaving  it  up  to  the  reader  to  generalize. 
The "Two-Equipments, One Repair Group" System.

Consider  the  following  possible  states  in  which  the  system  may 
find  itself.   Of  course,  the  states  may  be  identified  in  any  way, 
e.g.  by  letters;  the  numbers  we  use  mean  nothing  intrinsically. 


Equipment 1     Equipment 2     State

Up              Up              0
Up              Down            1
Down            Up              2
Down First      Down Second     3
Down Second     Down First      4

Note especially the last two lines of the table: if repair is
conducted on a first-come, first-served basis the noted distinction
must be made. To explain, consider State 3: if this state prevails,
Equipment 1 failed first (and is undergoing repair), while Equipment
2 is also down, but is awaiting repair. The reverse is true in
State 4. If priority repair is carried out, with priority determined
by expected repair time duration, then a simpler table (and fewer
equations) may result.

It is straightforward to write down the descriptive differential
equations. We make the usual assumptions. Let $P_j(t)$ be the probability
that the system is in state $j$ at time $t$. We have, up to terms negligible
compared to $dt$,

$$P_0(t+dt) = P_0(t)[1 - (\lambda_1+\lambda_2)\,dt] + P_1(t)\,\mu_2\,dt + P_2(t)\,\mu_1\,dt, \tag{3.1}$$

so subtracting $P_0(t)$ from both sides, dividing by $dt$, and taking limits yields

$$\frac{dP_0}{dt} = -(\lambda_1+\lambda_2)P_0 + \mu_2 P_1 + \mu_1 P_2. \tag{3.2}$$

Setting the derivative equal to zero, the steady-state equation is

$$(\lambda_1+\lambda_2)\,p_0 = \mu_2\, p_1 + \mu_1\, p_2. \tag{3.3}$$

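By a similar argument we write down

$$(\lambda_1+\mu_2)\,p_1 = \lambda_2\, p_0 + \mu_1\, p_3, \tag{3.4}$$

$$(\mu_1+\lambda_2)\,p_2 = \lambda_1\, p_0 + \mu_2\, p_4, \tag{3.5}$$

$$\mu_1\, p_3 = \lambda_2\, p_2, \tag{3.6}$$

$$\mu_2\, p_4 = \lambda_1\, p_1. \tag{3.7}$$

Furthermore,

$$p_0 + p_1 + p_2 + p_3 + p_4 = 1, \tag{3.8}$$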


since exactly one state must prevail at any time. Equations (3.3)-(3.7)
are linearly dependent, so in order to
find the probabilities we must leave out one equation among (3.3)-(3.7),
but retain (3.8). In truth, these equations may be solved simultaneously
to yield an algebraic formula in this simple case, but it will
probably be more effective to carry out solutions numerically after
plugging in suitable parameter values. A digital computer will do
this problem quite easily, utilizing standard codes for solving linear
equations. In general, the number of equations will be about $2^m$,
although this represents a low estimate in the present case. It is
far easier to count the number of machines in each condition (up,
down, undergoing repair, or waiting) than it is to keep track of
each machine or equipment. This is why the models described earlier
appear in textbooks, and why the later models, while more realistic,
are ignored.
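
The numerical solution mentioned above can be sketched as follows. The sketch drops (3.3), keeps (3.4)-(3.7) and the normalization (3.8), and solves the five linear equations with a small Gaussian elimination routine written out so that no outside library is assumed. The function names and the parameter values are illustrative assumptions.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for i in range(col + 1, n):
            factor = aug[i][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[i][j] -= factor * aug[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def two_equipment_steady_state(lam1, lam2, mu1, mu2):
    """Return [p0, p1, p2, p3, p4] for the two-equipment, one-team system."""
    a = [
        [-lam2, lam1 + mu2, 0.0, -mu1, 0.0],  # (3.4)
        [-lam1, 0.0, mu1 + lam2, 0.0, -mu2],  # (3.5)
        [0.0, 0.0, -lam2, mu1, 0.0],          # (3.6)
        [0.0, -lam1, 0.0, 0.0, mu2],          # (3.7)
        [1.0, 1.0, 1.0, 1.0, 1.0],            # (3.8)
    ]
    b = [0.0, 0.0, 0.0, 0.0, 1.0]
    return solve_linear(a, b)


if __name__ == "__main__":
    # Hypothetical rates for, say, a sonar (1) and a computer (2).
    p = two_equipment_steady_state(lam1=0.05, lam2=0.02, mu1=1.0, mu2=0.5)
    print([round(x, 5) for x in p])
```

With equal failure rates and equal repair rates the result should collapse to the classical repairman model with two equipments and one team, which gives a simple check on the code.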

Note that matters strictly related to manpower quality and
training specifically enter these models through the repair rate
parameters $\mu_i$: the higher the level of skill the larger will be $\mu_i$,
and the greater the fraction of time that the equipment will be available.
Incidentally, it might be noted that while the elaborate system
of states presented is required for analysis, only a few states are
really of interest: $p_0$ is the probability that both equipments are
entirely available, $p_1$ is the probability that Equipment 1 is
available and Equipment 2 is not, while $p_2$ is the reverse (Equipment
2 up, Equipment 1 down).

It  should  be  clear  that  equations  of  the  same  form  can  be  written 
to describe more complex setups (more machines, and more teams). Again,
although  our  model  is  simple  it  will  at  least  be  useful  for  checking 
the  internal  validity  of  the  SHIP  II  simulation  program.   Of  course, 
the  solutions  obtained  are  not  time-dependent:   we  would  anticipate 
higher  readiness  towards  the  start  of  a  cruise  or  mission  than  at  the 
end  just  because  all  systems  should  be  initially  operative.   At  another 
time  we  will  discuss  time-dependent  or  transient  solutions.   These  allow 
us to trace changes in readiness as a mission progresses; the latter
changes  may  well  reflect  a  repair  capability  that  is  inadequate  to 
keep  up  with  failures. 
4.   Preventive Maintenance and Repair.

Widely  held  conventional  wisdom  teaches  that  preventive 
maintenance  (p.m.)  can  eliminate  or  postpone  chance  failures  that  may 
occur  during  an  operation.   Thus  preventive  maintenance  may  improve 
readiness,  and  is  included  as  part  of  the  normal  workload  for  that 
reason.   Several  comments  may  be  made,  however. 

(a)  Some  preventive  maintenance  activities  may  actually  reduce 
readiness,  for  example,  if  carried  out  by  inexperienced  and 
unmotivated personnel. Certainly equipment is "down" while
testing and maintenance are under way, and logistics problems
may be created if unnecessary replacements are made. A service
school (for technicians) should provide a good environment for
deciding about the efficiency of preventive maintenance involving
various equipment types and differing levels of technician competence.
An experimental program could be designed for dealing
with just this question.

(b)  The  present  SHIP  II  model  recognizes  the  existence  of  p.m.  as 
part  of  workload,  but  does  not  relate  the  effort,  or  its  skill 
level, to altered time between failures. While an exact relationship
would be difficult to establish, a reasonable class of models
can be constructed that reflect the behavior desired, and that
allow a decision maker to answer "what if?" questions concerning
reliability and maintenance activities.

I  believe  that  the  formulation  of  such  an  explicit  model  will 
stimulate  further  inquiry  into  the  interrelationship  between  p.m.  and 
readiness. I do not claim that the simple models suggested here are
precisely valid, only that they are more representative of
"real life" than is a model that ignores any possible relationship.
(4.1)   One System, One Repair Team.

In order to make a beginning, we imagine that one system (a sonar,
for example) is served by one repair team. The latter is responsible
for p.m. and corrective maintenance (c.m.). We suppose that p.m. is
intended to assure that c.m. incidents occur as rarely as possible.

Model A. Let $\lambda_H$ be a high failure rate (short MTBF), $\lambda_L$ be
a low failure rate (long MTBF) (so $\lambda_L < \lambda_H$), $\mu$ be the repair rate,
and $\nu$ be the rate at which p.m. activities are conducted. The significance
of the high failure rate vs. low failure rate ($\lambda_H$ vs. $\lambda_L$)
contrast is that presumably the low failure rate occurs if correct,
successful p.m. is performed, while otherwise $\lambda_H$ is in effect. We
also introduce a probability structure for the adequacy of p.m. If the
current system state is H (so that $\lambda_H$ prevails) and if a p.m. is
performed then we let $\bar h$ ($0 \le \bar h \le 1$) be the probability that the
system's state remains H, while with probability $h = 1 - \bar h$ it is
shifted to L. Similarly, if the system is in L, let
$\bar\ell$ be the chance of remaining in L, while $\ell = 1 - \bar\ell$ is the chance
of passing to H when p.m. is performed. Thus, "good" p.m. is characterized
by $h$ near unity, and $\ell$ near zero. A little reflection
indicates that if the opposite is true then it may be desirable to have
a small value for $\nu$ (p.m. rate or the rate of introducing p.m. activities),
meaning that the time between detrimental tinkerings is long.
We also introduce another set of parameters analogous to $h$
and $\ell$ in order to describe the results of c.m. For the moment we
shall assume that when a failure occurs it is repaired at rate $\mu$,
regardless of its failure state just before failure. Also, we will
assume that it is restored to the low failure rate (L) state by
c.m. with probability $\pi_L$, and to the high rate with probability
$\pi_H = 1 - \pi_L$. These probabilities are independent of the past. They
reflect the quality of repair service, and hence of the assigned
personnel, as is the case with $h$ and $\ell$.

The values of $\ell$ and $h$, and of $\pi_L$ and $\pi_H$, may differ
because of differing personnel policies. For instance, $h$ and $\bar\ell$
may be low, yet $\pi_L$ is high, simply because new or relatively ill-trained
personnel are assigned to preventive maintenance, but when a
failure (c.m. incident) occurs the better people are brought into the ball
game. Whether this is a wise strategy obviously depends upon the
relationship between the various parameters; many times it will not be.

Now let $u_H(t)$ be the probability that our system is up at $t$,
and has failure rate $\lambda_H$ (i.e. is in state H), let $u_L(t)$ denote
the probability that the system is up but is endowed with $\lambda_L$ (is
in state L), and let $r(t)$ denote the chance that the system is down
for c.m. repair. All of our simple assumptions, many of which can be
immediately relaxed, indicate that we are dealing with a simple Markov
process in 3 states. Thus, we can write down differential equations
for $u_H(t)$, $u_L(t)$, and $r(t)$. Alternative methods of analysis are
also useful, especially when exponential distributions are jettisoned.
Such methods will be described later.
To derive the differential equations, and then the long-run
behavior, arguments go as follows. In order for the system to be in
state H at $t + dt$ either (i) it was in H at $t$ and did not
change in $(t, t+dt)$, (ii) it was in L at $t$, and preventive maintenance
occurred (instantaneously, by present assumption), shifting the system
to H, or (iii) it was on repair (c.m.) at $t$, repair was completed
in $(t, t+dt)$, and the repair led to the H state. Other events are
negligible. Thus

$$u_H(t+dt) = u_H(t)[1 - \nu h\,dt - \lambda_H\,dt] + u_L(t)\,\nu\ell\,dt + r(t)\,\mu\pi_H\,dt,$$

leading to

$$\frac{du_H}{dt} = -[\nu h + \lambda_H]\,u_H + \nu\ell\,u_L + \mu\pi_H\, r(t). \tag{4.1}$$

Next, a similar argument leads to the equation

$$u_L(t+dt) = u_L(t)[1 - \nu\ell\,dt - \lambda_L\,dt] + u_H(t)\,\nu h\,dt + r(t)\,\mu\pi_L\,dt,$$

leading to

$$\frac{du_L}{dt} = -[\nu\ell + \lambda_L]\,u_L + \nu h\,u_H + \mu\pi_L\, r(t). \tag{4.2}$$

Finally,

$$u_H(t) + u_L(t) + r(t) = 1, \tag{4.3}$$

and one can solve (4.1), (4.2), and (4.3) simultaneously. To obtain
a time dependent solution Laplace transforms may be used; the stationary
solution is obtained by setting the derivatives equal to zero in (4.1)
and (4.2) and solving the resulting linear equations. Explicit
solutions will be given and discussed later.
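
Pending those explicit solutions, the stationary case is simple enough to sketch directly. Setting the derivatives in (4.1) and (4.2) to zero gives two linear equations for the ratios $u_H/r$ and $u_L/r$, which the code below solves by Cramer's rule before normalizing with (4.3). The function name, argument names, and numerical values are assumptions for illustration; `h` and `ell` are the p.m. switching probabilities as they enter (4.1) and (4.2), and `pi_l` is the chance that c.m. restores the low failure rate state.

```python
def model_a_stationary(lam_h, lam_l, mu, nu, h, ell, pi_l):
    """Stationary (u_H, u_L, r) for Model A; requires lam_h, lam_l > 0."""
    pi_h = 1.0 - pi_l
    # Determinant of the 2 x 2 system obtained from (4.1) and (4.2).
    det = (nu * h + lam_h) * (nu * ell + lam_l) - nu * nu * h * ell
    ratio_h = mu * (pi_h * (nu * ell + lam_l) + pi_l * nu * ell) / det  # u_H / r
    ratio_l = mu * (pi_l * (nu * h + lam_h) + pi_h * nu * h) / det      # u_L / r
    r = 1.0 / (1.0 + ratio_h + ratio_l)                                 # (4.3)
    return ratio_h * r, ratio_l * r, r


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to exercise the formulas.
    u_h, u_l, r = model_a_stationary(
        lam_h=2.0, lam_l=0.5, mu=4.0, nu=1.0, h=0.8, ell=0.1, pi_l=0.7
    )
    print("P{up, state H} =", round(u_h, 4))
    print("P{up, state L} =", round(u_l, 4))
    print("P{down}        =", round(r, 4))
```

The long-run readiness is then `u_h + u_l`, and varying `nu`, `h`, `ell`, or `pi_l` gives quick answers to "what if?" questions about p.m. policy.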

Model  B.   It  is  helpful  to  generalize  the  previous  model,  and 
to  treat  it  in  an  alternative  and  more  general  way.   To  do  so  we  will 
first  write  expressions  for  the  distributions  of  the  time  to  failure 
(c.m. instants) starting from a repair completion time.

Let

$A_L$ = an up-period duration, beginning with the equipment in state
L, having just ended c.m. (or p.m.);

$A_H$ = same, but referring to a c.m. (or p.m.) that initializes the
equipment in state H.

Decomposition at the first event, followed by application of the convolution
property of the transforms, gives

$$a_L(s) \equiv E[e^{-sA_L}] = \int_0^\infty e^{-sx}[1-G(x)]\,e^{-\lambda_L x}\lambda_L\,dx + a_L(s)\,\bar\ell \int_0^\infty e^{-sx}e^{-\lambda_L x}\,G\{dx\} + a_H(s)\,\ell \int_0^\infty e^{-sx}e^{-\lambda_L x}\,G\{dx\}; \tag{4.4}$$

and

$$a_H(s) \equiv E[e^{-sA_H}] = \int_0^\infty e^{-sx}[1-G(x)]\,e^{-\lambda_H x}\lambda_H\,dx + a_H(s)\,\bar h \int_0^\infty e^{-sx}e^{-\lambda_H x}\,G\{dx\} + a_L(s)\,h \int_0^\infty e^{-sx}e^{-\lambda_H x}\,G\{dx\}. \tag{4.5}$$
Here $G$ is the distribution of the time between p.m. moments, measured
from a c.m. termination. Simplification of (4.4) and (4.5) yields two
simultaneous linear equations for the transforms:

$$a_L(s) = \lambda_L \left[\frac{1-\hat G(s+\lambda_L)}{s+\lambda_L}\right] + \bar\ell\,\hat G(s+\lambda_L)\,a_L(s) + \ell\,\hat G(s+\lambda_L)\,a_H(s), \tag{4.6}$$

$$a_H(s) = \lambda_H \left[\frac{1-\hat G(s+\lambda_H)}{s+\lambda_H}\right] + \bar h\,\hat G(s+\lambda_H)\,a_H(s) + h\,\hat G(s+\lambda_H)\,a_L(s), \tag{4.7}$$

where $\hat G(s)$ represents the Laplace-Stieltjes transform of the distribution
of the time between successive p.m. moments. We give two examples:

Example 1. $G(x) = 1 - e^{-\nu x}$; $\hat G(s) = \nu(\nu+s)^{-1}$. This is the case of
random occasions for p.m. on the particular equipment. The expected
or average time between inspections is $\nu^{-1}$. Such an assumption may
be reasonable if waits occur for crucial personnel who are otherwise
occupied doing c.m. or other tasks. It is probably less realistic
than the next.

Example 2. $G(x) = 0$ for $x < \nu^{-1}$ and $G(x) = 1$ for $x \ge \nu^{-1}$;
$\hat G(s) = e^{-s/\nu}$. This is the case of
regular inspections and p.m. actions, at time intervals of exact, non-random
length $\nu^{-1}$.

Other distributions for inspection intervals are possible. One
reasonable approach would be to determine $G(x)$ empirically, either
from actual shipboard data or from simulations or more extensive models.
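In order to derive the stationary or long-run probability of
readiness we must have the expected values of $A_L$ and $A_H$, and these
may be derived by differentiating (4.6) and (4.7), and then solving
the resulting linear equations. Laplace transform theory tells us
that an equivalent result is obtained by computing the limits

$$\lim_{s\to 0}\frac{1-a_L(s)}{s} = E[A_L] \quad\text{and}\quad \lim_{s\to 0}\frac{1-a_H(s)}{s} = E[A_H].$$

If we work directly with (4.6) and (4.7) we find the equations

$$-\ell\,\hat G(\lambda_L)\,E[A_H] + \{1 - \bar\ell\,\hat G(\lambda_L)\}\,E[A_L] = \frac{1-\hat G(\lambda_L)}{\lambda_L}, \tag{4.8}$$

$$\{1 - \bar h\,\hat G(\lambda_H)\}\,E[A_H] - h\,\hat G(\lambda_H)\,E[A_L] = \frac{1-\hat G(\lambda_H)}{\lambda_H}. \tag{4.9}$$

The solution to this set is

$$E[A_H] = \frac{\left[\dfrac{1-\hat G(\lambda_H)}{\lambda_H}\right][1-\bar\ell\,\hat G(\lambda_L)] + \left[\dfrac{1-\hat G(\lambda_L)}{\lambda_L}\right] h\,\hat G(\lambda_H)}{[1-\bar h\,\hat G(\lambda_H)][1-\bar\ell\,\hat G(\lambda_L)] - \ell\, h\,\hat G(\lambda_L)\hat G(\lambda_H)}, \tag{4.10}$$

$$E[A_L] = \frac{\left[\dfrac{1-\hat G(\lambda_L)}{\lambda_L}\right][1-\bar h\,\hat G(\lambda_H)] + \left[\dfrac{1-\hat G(\lambda_H)}{\lambda_H}\right] \ell\,\hat G(\lambda_L)}{[1-\bar h\,\hat G(\lambda_H)][1-\bar\ell\,\hat G(\lambda_L)] - \ell\, h\,\hat G(\lambda_L)\hat G(\lambda_H)}. \tag{4.11}$$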

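From these we can find a general expression for long-run reliability,
$r$, of the system (probability that the system is in the up state):

$$r = \frac{\pi_H E[A_H] + \pi_L E[A_L]}{\pi_H E[A_H] + \pi_L E[A_L] + \mu^{-1}}, \tag{4.12}$$

where $\mu^{-1}$ is the mean repair time.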


Optimum  Inspection  Interval 

It is intuitively clear that if $\pi_L$ is close to one (so that
c.m. usually returns the system to the low failure rate state), while
p.m. often switches the system from state L to state H, and if $\lambda_H$
is much greater than $\lambda_L$, then it is best to avoid p.m. In other
words, there may be an optimum p.m. rate, $\nu_{opt}$, that maximizes

$$E[A] = \pi_H E[A_H] + \pi_L E[A_L], \tag{4.13}$$

and hence also maximizes the reliability $r$. For any given set of
parameter values we can of course explore $E[A]$ as a function of $\nu$
by direct computation. Very probably this will be the only feasible
way of proceeding, especially when regular inspections and p.m. activities
are carried out; see Example 2. In the case of Example 1 we can
actually find the optimum interval analytically or in closed form.
By simplification of (4.13) with the assistance of (4.10) and (4.11)
we find that

$$E[A] = \frac{\pi_H\lambda_L + \pi_L\lambda_H + \nu(\ell+h)}{(\lambda_H+\nu h)(\lambda_L+\nu\ell) - \ell\, h\,\nu^2}. \tag{4.14}$$

Next,  differentiation  with  respect  to  v   is  carried  out,  and  the 
result  set  equal  to  zero;  the  obvious  concavity  shows  that  at  most  one 
interior  maximum  exists: 

v    =  Optimum  cm.  Interval  = 
opt 

[U+h)(X_  h+Xu  I)    -  h  XT  -I   X  ] 
h 2 h d_    (4.15) 


XH  XL  "  (KXL+I  V[XL  +  1TL(XH-V] 


One qualitative fact that emerges is that if $\pi_L$ is increased, the
optimum p.m. interval also increases. The effect is to postpone p.m.
actions that may cause a change to the high failure rate state. We
must, of course, compare $E[A]$ evaluated at $\nu_{opt}$ with its value when
$\nu = 0$ or $\nu = \infty$, for the true optimum may occur at one of those
points for certain parameter values.
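
The direct computation of $E[A]$ as a function of the p.m. rate can be sketched as follows, using (4.10), (4.11), and (4.13) with either example transform. All function names and parameter values are assumptions made for illustration. Here `h` is the probability that p.m. moves the system from H to L and `ell` the probability that it moves the system from L to H, as they enter (4.1) and (4.2). The reliability line assumes the form $r = E[A]/(E[A] + \mu^{-1})$, i.e. mean up time over mean cycle length.

```python
import math


def mean_up_times(lam_h, lam_l, h, ell, g_hat):
    """E[A_H] and E[A_L] from (4.10) and (4.11); g_hat is the transform of G."""
    g_h, g_l = g_hat(lam_h), g_hat(lam_l)
    b_h = (1.0 - g_h) / lam_h
    b_l = (1.0 - g_l) / lam_l
    h_bar, ell_bar = 1.0 - h, 1.0 - ell
    det = (1.0 - h_bar * g_h) * (1.0 - ell_bar * g_l) - ell * h * g_l * g_h
    e_ah = (b_h * (1.0 - ell_bar * g_l) + b_l * h * g_h) / det
    e_al = (b_l * (1.0 - h_bar * g_h) + b_h * ell * g_l) / det
    return e_ah, e_al


def mean_up(nu, lam_h, lam_l, h, ell, pi_l, regular=False):
    """E[A] of (4.13) for p.m. rate nu > 0 (Example 2 if regular, else Example 1)."""
    if regular:
        g_hat = lambda s: math.exp(-s / nu)
    else:
        g_hat = lambda s: nu / (nu + s)
    e_ah, e_al = mean_up_times(lam_h, lam_l, h, ell, g_hat)
    return (1.0 - pi_l) * e_ah + pi_l * e_al


def mean_up_closed_form(nu, lam_h, lam_l, h, ell, pi_l):
    """Formula (4.14), valid for Example 1 only."""
    num = (1.0 - pi_l) * lam_l + pi_l * lam_h + nu * (ell + h)
    den = (lam_h + nu * h) * (lam_l + nu * ell) - ell * h * nu ** 2
    return num / den


if __name__ == "__main__":
    # Hypothetical parameters; mu is the assumed repair rate.
    params = dict(lam_h=2.0, lam_l=0.5, h=0.8, ell=0.1, pi_l=0.7)
    mu = 4.0
    for nu in (0.1, 0.5, 1.0, 2.0, 5.0, 20.0):
        for regular in (False, True):
            e_a = mean_up(nu, regular=regular, **params)
            print(nu, regular, round(e_a, 4), round(e_a / (e_a + 1.0 / mu), 4))
```

Scanning a grid of `nu` values in this way, and including very small and very large rates, is one way to carry out the comparison with the end points $\nu = 0$ and $\nu = \infty$.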
5.   Concluding Discussion.

I  have  attempted  to  present  in  this  report  several  methods  and 
models  that  are  relevant  to  studying  manning  problems  of  the  type 
addressed  by  the  SHIP  II  simulation.   Although  the  models  here  are  not 
as  elaborate  as  those  encompassed  by  SHIP  II,  more  detail  can  be  built 
in by taking a mathematical approach. We believe that such supplementary
modeling should be carried out in parallel with most complex
simulations, if only to check on internal validity. Very possibly a
simple  analytical  model  can  be  "fitted"  to  simulation  data,  leaving 
certain  parameters  free  for  variation.   An  example  might  be  the  p.m. 
rate  appearing  exogenously  in  our  last  model:   the  latter  will  depend 
upon  manpower  level  and  other  tasks,  but  could  be  estimated  from 
simulation  data  with  the  above  parameters  held  fixed.   We  can  then  use 
the  model  to  estimate  the  effect  of  p.m.  on  reliability  without  actually 
sampling.   This  would  result  in  an  obvious  and  welcome  economy  in 
computational  time.   Avoidance  of  simulation  when  possible  also  avoids 
the  confusion  of  random  error. 

Future  work  in  the  present  area  will  include  consideration  of 
transient  behavior  of  our  repair  processes.   Under  reduced,  or  less 
skilled,  manning,  one  would  anticipate  degradation  of  reliability 
throughout  a  mission.   This  effect  can  be  traced  both  by  simulation 
and by mathematical analysis. Finally, we plan to consider the presentation
of the mathematical results in the form of an interactive computer
program that a user can manipulate from a console. This enables an
analyst to change parameter values at will and by experimentation build
up a sound and useful feeling for the effects of various system parameters.